I. CERTAIN THINGS SHOULD NOT HAPPEN



Like many people working in quantum information sci- 
ence, Bob had spent a few weeks in the Centre for Quan- 
tum Technologies in Singapore, collaborating with Alice. 
Some time after he left, Alice finished preparing ten tu- 
torials for her module on quantum biology. She thought 
of sharing them with Bob, who was preparing to teach 
a similar module in his university. However, the latest 
policies allow only a 1 Mb e-mail attachment per year
[29], and each tutorial alone amounts to 1 Mb. Alice is
in a dilemma: which tutorial will be the best for Bob? 
It would be much simpler to let Bob choose. But this 
means that the information about all the tutorials must 
be made available in Bob's location. How can that hap- 
pen by sending only a much smaller amount of informa- 
tion? 

Alice remembers having shared with Bob, when he was 
in Singapore, a one-time pad key and even several qubits 
maximally entangled with hers. Quantum channels can 
perform tasks that appear incredible to the classically- 
minded. Can then these shared resources be helpful 
for this specific task? Alice does not believe it: she 
knows that shared randomness and entanglement are no- 
signaling resources. So, she argues, how could they help 
in sending new information, like the tutorials, which did 
not even exist at the time of the sharing? 

In this text, we show that Alice's argument is wrong: 
no-signaling resources could in principle solve that task. 
Her final conclusion is nevertheless correct: the no- 
signaling resources that exist in our world cannot solve 
that task. Why? It is probably beyond physics to an- 
swer this question. Maybe simply because certain things 
should not happen? 



II. THE CONTEXT 



A. Defining quantum physics 



Definire means to find the boundary. In order to define 
quantum physics, therefore, one can't invoke the "typi- 
cally quantum" notions of coherence and entanglement: 
if anything, these notions fix the boundaries of classical 
physics. One really needs to go to the quantum finis
terrae. However, all known natural phenomena can be
made to fit in the quantum framework. So, are there any 
boundaries to be found at all? 

We leave the question open regarding boundaries in 
nature. But there are certainly boundaries in the world 
of physical theories. In quantum theory: (i) physical 
systems must be described by Hilbert spaces, their pure 
states by one-dimensional projectors, with the rule that 
orthogonal vectors describe fully distinguishable states; 
and (ii) the evolution in time must be reversible. As is
well known by now, pretty much all the formalism stems
from these two requirements: a clear boundary, a sharp
definition, and a very successful one. However, curiosity
is not assuaged: recipes (i) and (ii) define a boundary
with what? What is there outside? What would physics
be like if (i) and (ii) were not true?



B. No-signaling is not enough 



1. No-signaling as a principle



It is far from easy to invent decent, consistent answers 
to the previous questions. Even the anarchical freedom 
of science fiction has ultimately produced a single cre- 
ative alternative: signaling, in all its possible variations 
(faster-than-light travel, teleportation of matter between 
distant locations, etc). No-signaling is certainly a bound- 
ary, and a very constraining one at that: just think how 
tiny is the portion of the universe that the human kind 
may hope to visit, unless a family of kind wormholes
comes to the rescue. So let us take this single suggestion
seriously: is no-signaling the physical principle that defines
our (quantum) universe?

FIG. 1: The representation of a bipartite no-signaling
probability distribution, or "no-signaling box", used in this text.
The wavy line is not meant as a material connection, but
only as a reminder of the existence of correlations. The
PR-box is defined by $x, y, a, b \in \{0,1\}$, random marginals, i.e.
$P(a|x) = P(b|y) = \frac{1}{2}$, and perfect correlations satisfying
$a \oplus b = xy$.

Popescu and Rohrlich were the first to raise this question
explicitly, and to find a negative answer. The
counter-example uses a simple mathematical object that
had been described some years earlier by Rastall;
nowadays it is customarily referred to as the PR-box [30].



2. The PR-box and the CHSH game 



The PR-box is a specific bipartite no-signaling prob- 
ability distribution with both binary input and output 
(Figure 1). Alice can input a bit x and receives a bit a as
output; and similarly Bob can input a bit y and receives 
a bit b as output. The PR-box is specified by the rule 

$$P_{PR}(a,b|x,y) = \frac{1}{2}\,\delta_{a \oplus b = xy}, \qquad (1)$$

where the symbol $\oplus$ indicates sum modulo 2. In other
words, a and b are always locally random; they are equal 
in the three cases (x,y) = (0,0), (0,1) and (1,0), while 
they are different when (x,y) = (1, 1). 
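
As a minimal sketch (not part of the original text), rule (1) can be tabulated and its defining properties checked numerically. The names `pr_box`, `marginal_a` and `marginal_b` are illustrative assumptions.

```python
from itertools import product

BITS = (0, 1)

def pr_box(a, b, x, y):
    """P_PR(a,b|x,y) from rule (1): 1/2 when a XOR b == x AND y, else 0."""
    return 0.5 if (a ^ b) == (x & y) else 0.0

def marginal_a(a, x, y):
    """Alice's marginal P(a|x,y), obtained by summing over Bob's output."""
    return sum(pr_box(a, b, x, y) for b in BITS)

def marginal_b(b, x, y):
    """Bob's marginal P(b|x,y), obtained by summing over Alice's output."""
    return sum(pr_box(a, b, x, y) for a in BITS)

# Each distribution P_xy is normalized.
normalized = all(
    sum(pr_box(a, b, x, y) for a, b in product(BITS, repeat=2)) == 1.0
    for x, y in product(BITS, repeat=2)
)

# Marginals are uniform and do not depend on the other party's input.
no_signaling = all(
    marginal_a(a, x, 0) == marginal_a(a, x, 1) == 0.5
    and marginal_b(b, 0, y) == marginal_b(b, 1, y) == 0.5
    for a, b, x, y in product(BITS, repeat=4)
)

# Probability that the two outputs coincide, for each pair of inputs.
outputs_equal = {
    (x, y): sum(pr_box(a, a, x, y) for a in BITS)
    for x, y in product(BITS, repeat=2)
}
print(normalized, no_signaling, outputs_equal)
```

The dictionary `outputs_equal` makes the last sentence above concrete: the outputs coincide with certainty for three input pairs and never for (1, 1).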

The PR-box is tailored to violate maximally the 
Clauser-Horne-Shimony-Holt (CHSH) Bell inequality.
For the purpose of this paper, we present this criterion as 
the CHSH game. Alice and Bob are given two binary in- 
puts and must produce, without communication, binary 
outcomes satisfying the rule $a \oplus b = xy$ of (1). If the inputs are
distributed randomly, the probability of success is

$$p_{CHSH} = \frac{1}{4} \sum_{x,y=0}^{1} P(a \oplus b = xy \,|\, x,y). \qquad (2)$$

If Alice and Bob are allowed to use only classical shared 
randomness, their winning probability is bounded as 



$p_{CHSH} \le p_C = \frac{3}{4}$. If they can share entanglement,
their winning probability is increased up to the Tsirelson
bound [3]

$$p_{CHSH} \le p_Q = \frac{2+\sqrt{2}}{4} \approx 85\%, \qquad (3)$$

which is still smaller than one. By construction, the
PR-box reaches $p_{CHSH} = 1$.
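
The three values quoted above can be checked with a short computation. This is a sketch under stated assumptions: the classical bound is obtained by brute force over the 16 local deterministic strategies (shared randomness only mixes them, so it cannot do better than the best one), and the Tsirelson value is simply evaluated from its closed form rather than derived. All function names are illustrative.

```python
from itertools import product
from math import cos, pi, sqrt

BITS = (0, 1)

def chsh_success(box):
    """Winning probability (2) for box(a, b, x, y) = P(a,b|x,y), uniform inputs."""
    return sum(
        box(a, b, x, y)
        for a, b, x, y in product(BITS, repeat=4)
        if (a ^ b) == (x & y)
    ) / 4

def deterministic_box(f, g):
    """Local deterministic strategy: a = f[x] and b = g[y]."""
    return lambda a, b, x, y: 1.0 if (a == f[x] and b == g[y]) else 0.0

def pr_box(a, b, x, y):
    return 0.5 if (a ^ b) == (x & y) else 0.0

# Best local deterministic strategy among the 4 x 4 = 16 possibilities.
classical_max = max(
    chsh_success(deterministic_box(f, g))
    for f in product(BITS, repeat=2)
    for g in product(BITS, repeat=2)
)

# Closed form of the Tsirelson bound (3); it equals cos^2(pi/8).
tsirelson = (2 + sqrt(2)) / 4

pr_value = chsh_success(pr_box)
print(classical_max, tsirelson, pr_value)
```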

This simple argument proves that no-signaling cannot 
be the only physical principle that defines our quantum 
world. At least one other constraint is in place that limits
the probability of success of the CHSH game. We
can thus rephrase the questions of our curiosity: given 
that we live in a world, in which Bell's inequalities are 
violated, why are they then not violated as much as no- 
signaling would allow? Any physical principle (or col- 
lection thereof) claiming to come close to a definition of 
quantum physics should be able to deal with the riddle 
of the Tsirelson bound. 



C. Mathematical framework 



We focus on an operational generalization of quantum 
kinematics (states and measurement, without dynamics). 
The measurement process is defined as "choosing an in- 
put and getting an output" . The information about the 
state of the system is contained in the observed probabil- 
ity distributions of the outputs, for each input. Since we 
focus on bipartite systems, let us fix the notations: the 
inputs of Alice and Bob are written $x \in \mathcal{X}$ and $y \in \mathcal{Y}$,
respectively; the outputs (we assume that every input
leads to the same number of possible outcomes) are written
$a \in \mathcal{A}$ and $b \in \mathcal{B}$ respectively. So, for each x, y,
Alice and Bob can reconstruct the probability distribution
$P_{xy} = \{P(a,b|x,y) \,|\, a \in \mathcal{A}, b \in \mathcal{B}\}$. All that Alice
and Bob know about the system and the measurements
is captured by the probability point

$$V = \{P_{xy} \,|\, x \in \mathcal{X}, y \in \mathcal{Y}\}. \qquad (4)$$

A priori, each $P_{xy}$ is specified by $|\mathcal{A}||\mathcal{B}| - 1$ values because
of normalization; therefore V is generically specified by
$|\mathcal{X}||\mathcal{Y}|(|\mathcal{A}||\mathcal{B}| - 1)$ values.

For the following, it is important to classify probability 
points as follows: 

• V belongs to the classical set if it can be written as
a convex combination of local deterministic points,
i.e. points of the form $P(a,b|x,y) = \delta_{a=f(x)}\,\delta_{b=g(y)}$.
These points are the extremal points of the classical
set; since there are finitely many of them, namely
$|\mathcal{A}|^{|\mathcal{X}|}\,|\mathcal{B}|^{|\mathcal{Y}|}$, the classical set is a polytope. In summary,
the classical polytope contains all the V that
can be obtained from "local (hidden, or not hidden)
variables".

• V belongs to the quantum set if there exist a state
$\rho$ and projectors $\{\Pi_a^x, \Pi_b^y\}$ such that

$$P(a,b|x,y) = \mathrm{Tr}(\rho\, \Pi_a^x \Pi_b^y), \qquad (5)$$

where the projectors must satisfy
$[\Pi_a^x, \Pi_b^y] = [\Pi_a^x, \Pi_{a'}^x] = [\Pi_b^y, \Pi_{b'}^y] = 0$
for all a, a', b, b', x, y. There is
no loss of generality in considering only projective
measurements, since the dimensionality of $\rho$ is not
restricted. For finite-dimensional Hilbert spaces,
these relations between projectors are fulfilled if
and only if there is a tensor product representation
$\Pi_a^x = \tilde{\Pi}_a^x \otimes \mathbb{1}$ and $\Pi_b^y = \mathbb{1} \otimes \tilde{\Pi}_b^y$.



• V belongs to the no-signaling set if $P(a|x,y) = P(a|x)$
and $P(b|x,y) = P(b|y)$ for all a, b, x, y. This
set is also a polytope. Clearly the classical set
is included in the quantum set, which is included
in the no-signaling set. Notice also that the
no-signaling constraints reduce the number of values
required to specify a probability point V down to

$$|\mathcal{X}||\mathcal{Y}|(|\mathcal{A}|-1)(|\mathcal{B}|-1) + |\mathcal{X}|(|\mathcal{A}|-1) + |\mathcal{Y}|(|\mathcal{B}|-1).$$
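
The counting formulas of this section are easy to evaluate for a given scenario. The sketch below uses hypothetical helper names and takes the cardinalities of the input and output alphabets as plain integers.

```python
def full_dimension(nx, ny, na, nb):
    """Values needed to specify a generic probability point (normalization only)."""
    return nx * ny * (na * nb - 1)

def no_signaling_dimension(nx, ny, na, nb):
    """Values needed once the no-signaling constraints are imposed."""
    return nx * ny * (na - 1) * (nb - 1) + nx * (na - 1) + ny * (nb - 1)

def num_local_deterministic_points(nx, ny, na, nb):
    """Number of extremal points of the classical polytope."""
    return na ** nx * nb ** ny

# CHSH scenario: two inputs and two outputs per party.
chsh = (2, 2, 2, 2)
print(full_dimension(*chsh),
      no_signaling_dimension(*chsh),
      num_local_deterministic_points(*chsh))
```

For the CHSH scenario this gives 12 values before and 8 values after imposing no-signaling, and 16 local deterministic points.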

In this framework, we are looking for a physical prin- 
ciple, which would single out the quantum set within the 
no-signaling polytope. 

Before continuing, we want to stress a difference with 
other operational approaches, in particular with the line 
of research on axiomatics. There, a lot is built on
the assumption of tomography: it is supposed that some
given P's are known to carry all the information about the
system. This is physically possible if the degrees of freedom
under study and the measurements that are being
performed on them have been characterized. Here, on the
contrary, we work in a completely black-box scenario,
ultimately the same as in Bell's theorem and in
device-independent assessments. In such a scenario, the point
V can never be claimed to be "the state" , with the idea of 
complete information that this term conveys. Rather, V 
encodes just the information that can be gathered from 
the black boxes. This is also one of the reasons why we 
start out with bipartite systems: in a black-box scenario, 
the behavior of a single system can always be described 
in terms of hidden variables. 



III. INFORMATION CAUSALITY: THE TASK 



The statement of "no-signaling" is the impossibility 
of a task, namely, sending any amount of information 
by sampling a bipartite probability distribution. Every 
device-independent principle must have a task (an information
processing protocol) and a statement about it. In
this section we aim at explaining the choice of the task
and the statement of Information Causality. We start
by asking the question: in what sense is the PR-box too
powerful?

FIG. 2: Implementation of perfect oblivious transfer using the
PR-box and one bit of communication.



A. The power of the PR-box 



The first device independent principle that put some 
bounds on the winning probability of the CHSH game 
was that of nontrivial communication complexity [8]. It
has been shown that access to perfect PR-boxes allows
two parties to solve any communication complexity
problem with the transmission of a single bit. Later this
result was improved in [9], where it was shown that
the same happens even if the boxes are a little noisy, i.e.
they allow for a success probability in the CHSH game
greater than $\frac{3+\sqrt{6}}{6} \approx 0.908$. The question whether this
principle can be used to derive even stronger limits is still
open.

The simple idea behind this approach to the study of
nonlocality is the following: if nothing seems to be wrong
with the PR-boxes when the parties are not communicating
(and no communication must be the case if we would like
to use the no-signaling principle), then maybe something
goes wrong with them when communication takes place.
To see why this should be the case, let us put ourselves in
the place of Bob, the owner of one part of the PR-box.
When we choose our setting to be y = 0, we know that
the outcome of our part of the box is going to be equal
to the outcome of Alice, b = a. If we choose y = 1 instead,
then we can expect $b = a \oplus x$. We see that we can choose
to learn either one of the two independent bits a or $a \oplus x$ by
choosing different settings. Granted, these two bits
are perfectly random, but their randomness is the same.
What we mean by that is that both of them are generated
by XORing something deterministic (i.e. 0) or controlled
by Alice (i.e. x) with the same random bit a. This is
important because it allows Alice, by later transmitting
only a single bit to Bob, to erase the randomness in
whichever of the bits Bob might want to get, regardless of
his choice.



This property of the PR-box has been exploited in [10]
in the context of oblivious transfer (Figure 2). Imagine
that Alice has two bits $x_0$ and $x_1$. She can send only one
bit of classical communication to Bob, who is interested
in one of the bits (Alice does not know which one). Let the
index of the bit that Bob is interested in be k. If they
have access to a PR-box they can do this. Alice inputs
$x = x_0 \oplus x_1$ in her part of the box and, after reading a,
sends the one-bit message $m = x_0 \oplus a$ to Bob. Bob inputs
y = k, reads b and computes $c = m \oplus b = x_0 \oplus a \oplus b$. It
is easy to see that $c = x_k$. Indeed, if k = 0 then a = b
and $c = x_0$; if k = 1, then $b = a \oplus x$ and
$c = x_0 \oplus x = x_0 \oplus x_0 \oplus x_1 = x_1$.
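
The protocol can be simulated directly. In this sketch the PR-box is sampled by a single function that sees both inputs; this is only a simulation convenience (no physical mechanism is implied), and the function names are illustrative.

```python
import random
from itertools import product

def sample_pr_box(x, y, rng):
    """Sample outputs (a, b): a is uniformly random and b = a XOR (x AND y)."""
    a = rng.randint(0, 1)
    return a, a ^ (x & y)

def decode_with_pr_box(x0, x1, k, rng):
    """One run of the protocol: returns Bob's value c for his choice k."""
    a, b = sample_pr_box(x0 ^ x1, k, rng)  # Alice inputs x0 XOR x1, Bob inputs k
    m = x0 ^ a                             # the single communicated bit
    return m ^ b                           # Bob's output c

rng = random.Random(0)  # fixed seed, only for reproducibility
all_correct = all(
    decode_with_pr_box(x0, x1, k, rng) == (x0, x1)[k]
    for x0, x1, k in product((0, 1), repeat=3)
    for _ in range(20)
)
print(all_correct)
```

Bob recovers the chosen bit in every run, although the message m on its own is a uniformly random bit.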

Earlier we have promised that this analysis will show 
us what goes wrong if we consider the protocols with PR- 
boxes and communication. We are almost there. Look at 
the situation in Bob's laboratory when he has already
received Alice's message but has not yet chosen which
bit to decode. Considered as a black box, his lab now
has, in some sense, two bits. True, the extraction of
one will destroy the other but, since either can be decoded,
they both must be there. But we have transmitted only
a single bit and the PR-boxes are supposed to be no- 
signaling so they cannot be used to transmit the other. 
Somehow the amount of information that the lab of Bob 
has is larger than the amount it received. Things like 
this should not happen.



B. Random Access Codes



The protocol that we have just described is called a
(2,1,1) Random Access Code (RAC) [11]. It allows Alice
to encode two bits $x_0$ and $x_1$ into a single-bit message m
in such a way that Bob can decode any bit he chooses to.
The notion generalizes to that of an (N,M,p) RAC, which
allows Alice to encode N bits into an M-bit message in such
a way that the worst-case probability of Bob decoding
any of these bits correctly is p [31]. We can talk here
as well about the average success probability instead of
the worst case, since Yao's principle [12] applied to RACs
allows, with the use of shared randomness, to make these
two equal [13]. There are many different types of RACs
with slightly different properties, which depend on the
resources that we allow to be used. The most important
distinction among the known codes lies in what is being
communicated (classical bits or qubits).

In the code presented above the bits are decoded correctly
as long as the correlations a = b for y = 0 and
$a = b \oplus x$ for y = 1 are always true. If they occur with
probability p, then the box can win the CHSH game with
this probability and, at the same time, the average success
probability of the (2,1,p) RAC is also p. Therefore, we
see that finding a way to bound the success probability
of the RAC is equivalent to finding the bound on the
probability to win the CHSH game.

FIG. 3: The task of Information Causality is the same as the
one that defines a Random Access Code. Alice receives N
bits, and Bob is asked to guess one of them. Alice is allowed to
send a message m that carries M bits, where M < N to avoid
trivialities. Moreover, Alice and Bob can share a no-signaling
resource; in fact, in all this study the goal is to compare
the power of such resources. The usual figure of merit is the
success probability $p = \frac{1}{N} \sum_{k=0}^{N-1} \mathrm{Prob}(\beta_k = x_k | k)$; Information
Causality rather quantifies the amount of information that is
potentially available in Bob's location.



C. Task and statement of Information Causality


We are now in a position to define the task, to which 
the principle of Information Causality is going to apply. 
It is the same as an (N,M,p) Random Access Code, where
N and M count classical bits (Figure 3). Notice that it
does not matter how this information is encoded: when 
we refer to "sending the M bit message", it should be 
understood as a single use of a channel with classical 
communication capacity M. 

The statement of Information Causality requires that,
in the task just defined, the amount of information potentially
available to Bob about Alice's input cannot exceed
M bits. This potentiality is the key to the success of
Information Causality. If we considered only the information
that Bob actually gets, then this principle would
be equivalent to no-signaling (indeed, imposing that Bob
can actually receive only M bits is equivalent to stating
that any additional resource is no-signaling). However,
this little tweak makes all the difference, as we will see in
the next section.
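
To make "potentially available" concrete, the following sketch compares two ways of using one bit of communication (M = 1) for two uniformly random input bits. It assumes the Shannon mutual information as the measure, which is the choice adopted in Section IV, and sums it over Bob's two possible choices; the helper names are hypothetical.

```python
from itertools import product
from math import log2

def mutual_information(joint):
    """I(X:Y) in bits from a dict {(x, y): probability}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return sum(p * log2(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)

def potential_information(guess):
    """Sum over k of I(x_k : beta_k), with beta_k = guess(x0, x1, k, a).

    The input bits x0, x1 and Alice's local random bit a are uniform.
    """
    total = 0.0
    for k in (0, 1):
        joint = {}
        for x0, x1, a in product((0, 1), repeat=3):
            key = ((x0, x1)[k], guess(x0, x1, k, a))
            joint[key] = joint.get(key, 0.0) + 1 / 8
        total += mutual_information(joint)
    return total

def pr_box_guess(x0, x1, k, a):
    """Protocol with a perfect PR-box: message x0 XOR a, Bob outputs m XOR b."""
    b = a ^ ((x0 ^ x1) & k)
    return (x0 ^ a) ^ b

def send_x0_guess(x0, x1, k, a):
    """Classical baseline: Alice sends x0 and Bob outputs it whatever k is."""
    return x0

print(potential_information(pr_box_guess), potential_information(send_x0_guess))
```

With the PR-box the total is 2 bits although only one bit was sent; the classical baseline gives exactly 1 bit.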



D. The reason for the name 



But before we get to it, we would like to take this 
opportunity and make a short comment on the choice of 
the name for our principle. We do this mainly because
several people have asked us for the justification of our
choice. 

Let us reiterate that Information Causality is about 
forbidding more information to be potentially available 
to the receiver than has been sent by the sender. We 
hope that expressing our principle in that form makes 
the choice of the name clearer. Causality is the ability 
to change something over space-time. In the task we 
are considering, what gets changed is the information 
that Bob has about the particular bits of Alice. Before 
the protocol is run it is, by definition, zero. The cause 
is the transmission of the message, which increases the 
information. The statement about the task is that this 
increase in information is limited. In other words, we are 
putting a limit on the effect that the cause can have in
terms of information. Hence the name.



IV. MATHEMATICS 



A. The figure of merit 



Now we are ready to present the principle of Informa- 
tion Causality (shortened IC from now on) in its formal 
version. There are many different measures of informa- 
tion to choose from but in our case the choice is quite 
obvious. Since the task is about communicating over a 
channel with a specified classical communication capacity 
and because Shannon's celebrated single letter formula 
relates it to mutual information we take this measure. 
Therefore, the amount of information that Bob can potentially
have about the variable $x_i$ of Alice is given by
$I(x_i : \beta_i)$, where $\beta_i$ is the random variable that he
generates when using his optimal procedure for maximizing
the amount of information about this particular $x_i$. The
statement of IC is that

$$\sum_{i=1}^{N} I(x_i : \beta_i) \le M. \qquad (6)$$

Note that the variables $x_i$ do not have to be binary.
We do not make any assumptions about their alphabets.
The definition of IC that we have given here is slightly
stronger than the one given in the original paper [14].
There we have assumed that the communication from
Alice to Bob is over a noiseless classical channel. This
assumption can be lifted and, as we show in the next
section, our principle will still hold in quantum theory.



B. Information Causality holds for quantum
no-signaling resources



IC sounds like a reasonable thing to expect from the
universe, but so do locality, determinism or the notion
of absolute time. Therefore, in the presentation of a new
principle, there should always be a proof that it is not
violated by nature. Now we present a proof that IC holds
in classical and quantum information theory. We
focus on quantum correlations because classical correlations
form a subset of quantum correlations.

Let us denote by $\rho_B$ Bob's part of the shared quantum
state and by x the set of all Alice's variables $x_i$. We begin
by showing that, after Bob receives from Alice the message m,
which was communicated over a channel with classical
communication capacity M, all the classical
and quantum information he has does not carry more than
M bits of information about x:

$$I(x : m, \rho_B) \le M. \qquad (7)$$

For the proof we use the chain rule for mutual information,
$I(x : m, \rho_B) = I(x : \rho_B) + I(x : m | \rho_B)$. Since at
the beginning of the protocol Bob knows nothing about
the variables of Alice, $I(x : \rho_B) = 0$, and the second term
$I(x : m | \rho_B) = I(x, \rho_B : m) - I(\rho_B : m)$ is bounded by M
due to the positivity of the mutual information and the
fact that m is a message sent over a channel with
classical communication capacity M.

If Alice's input bits are independent, condition
(7) limits the information gain about the individual bits
as well, because

$$I(x : m, \rho_B) \ge \sum_{i=1}^{N} I(x_i : m, \rho_B). \qquad (8)$$

This inequality is also proved using the chain rule. Finally,
we observe that Bob's output bit $\beta_i$ is obtained
at the end from m and $\rho_B$. Hence, the data processing
inequality implies $I(x_i : m, \rho_B) \ge I(x_i : \beta_i)$, which gives
us (6).



C. Information-theoretical derivation of the
Tsirelson bound



Here we show that any theory which allows for the vio- 
lation of the Tsirelson bound violates also IC. To this end 
we consider a concatenated RAC. Let us explain what we 
mean by this. 

Previously we have presented a code which encodes
two classical bits into a single one and gives an average
probability of correct decoding equal to the winning
probability of the CHSH game. We may think about it
as a pair of black boxes. Alice puts two bits into hers and
it returns a single bit, which she sends to Bob. Bob then
puts this message into his box, chooses which
bit he wants to learn and gets a value which, with
probability p, is equal to the bit he is interested in. Now
imagine that Alice gets four bits instead of two and she
is still limited to one bit of communication. She and
Bob can construct a RAC for this task out of pairs
of the same boxes they used previously, with the help of a
concatenation procedure. It works like this. The parties
need three pairs of boxes. Alice puts two of her bits into
her first box and the remaining two into the second. These
boxes produce two messages, which she does not
send to Bob but puts into her third box instead. It is
the output of this final box that she sends to Bob. He inputs
it to his box from the third pair and chooses to learn
the message generated by the first or the second box of
Alice. He inputs this message into one of his other boxes
(the one paired with the box of Alice that generated this
message), and then he can retrieve the bit. The overall
success probability is now $p^2 + (1-p)^2$ if the success
probability for each pair of boxes is p.
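
The two-level construction can be checked by exact enumeration. The sketch below makes one modelling assumption that is not spelled out in the text: a noisy box is a PR-box whose correlation is flipped by an independent error bit e, with e = 1 occurring with probability 1 - p. All names are illustrative.

```python
from itertools import product

BITS = (0, 1)

def concatenated_guess(bits, k, a, e):
    """Bob's guess of bits[k] in the two-level code.

    bits: Alice's four input bits; k: index in 0..3 of the bit Bob wants.
    a[j], e[j]: Alice's local output and the error bit of box pair j, where
    pairs 0 and 1 form the first level and pair 2 the second level.
    """
    k_hi, k_lo = divmod(k, 2)
    # Alice: first-level messages, then the second-level encoding of them.
    x_low = [bits[0] ^ bits[1], bits[2] ^ bits[3]]   # inputs of boxes 0 and 1
    m_low = [bits[0] ^ a[0], bits[2] ^ a[1]]         # messages she keeps
    x_top = m_low[0] ^ m_low[1]                      # input of box 2
    m_top = m_low[0] ^ a[2]                          # the bit actually sent
    # Bob: decode the relevant first-level message, then the bit itself.
    b_top = a[2] ^ (x_top & k_hi) ^ e[2]
    m_guess = m_top ^ b_top
    b_low = a[k_hi] ^ (x_low[k_hi] & k_lo) ^ e[k_hi]
    return m_guess ^ b_low

def success_probability(p):
    """Average success probability for uniform inputs and uniform choice of k."""
    total = 0.0
    for bits in product(BITS, repeat=4):
        for k in range(4):
            for a in product(BITS, repeat=3):
                for e in product(BITS, repeat=3):
                    weight = 1 / 8                   # uniform local outputs a
                    for error in e:
                        weight *= (1 - p) if error else p
                    if concatenated_guess(bits, k, a, e) == bits[k]:
                        total += weight
    return total / (16 * 4)

print(success_probability(0.85), 0.85 ** 2 + 0.15 ** 2)
```

Bob's guess is wrong only when exactly one of the two boxes he uses errs, which is why the enumeration reproduces $p^2 + (1-p)^2$.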

The generalization of this procedure is quite straightforward.
If the parties use n levels of concatenation (using
just a single pair of boxes corresponds to n = 1) they
can encode $2^n$ bits using $2^n - 1$ pairs of boxes. The overall
success probability of decoding the desired bit correctly
is $p_n = \frac{1+E^n}{2}$, where E is the bias of the probability p
(i.e. $p = \frac{1+E}{2}$).

If $\beta_i$ is Bob's best guess of $x_i$ and they are equal with
probability $p_n$, then $I(x_i : \beta_i) = 1 - h(p_n)$, where h(.)
is Shannon's binary entropy. By expanding it into a
Taylor series one gets that

$$1 - h\left(\frac{1+E^n}{2}\right) \ge \frac{E^{2n}}{2 \ln 2}. \qquad (9)$$

Since only one bit has been communicated, IC implies

$$1 \ge \sum_{i=1}^{2^n} I(x_i : \beta_i) \ge 2^n \frac{E^{2n}}{2 \ln 2} = \frac{1}{2 \ln 2} \left(2E^2\right)^n \qquad (10)$$

for any n. This is going to be true only if $2E^2 \le 1$ or,
equivalently, $E \le \frac{1}{\sqrt{2}}$. This puts a bound on the winning
probability of the CHSH game, $p \le \frac{1}{2}\left(1 + \frac{1}{\sqrt{2}}\right)$, which is
exactly the Tsirelson bound.
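
The argument can be explored numerically by evaluating the total information $2^n (1 - h(p_n))$ for increasing n. This is a sketch under stated assumptions: the function names are hypothetical, and for small biases the Taylor series of $1 - h$ is used instead of the direct formula, purely to avoid floating-point cancellation.

```python
from math import log, log2, sqrt

def info_per_bit(bias):
    """1 - h((1 + bias)/2) in bits, for a guess with bias 0 <= bias <= 1."""
    if bias > 0.1:
        p = (1 + bias) / 2
        if p == 1.0:
            return 1.0
        return 1 + p * log2(p) + (1 - p) * log2(1 - p)
    # Small bias: Taylor series sum_k bias^(2k) / (2k (2k - 1)) / ln 2,
    # which avoids the cancellation in 1 - h when h is very close to 1.
    total, power = 0.0, 1.0
    for k in range(1, 12):
        power *= bias * bias
        total += power / (2 * k * (2 * k - 1))
    return total / log(2)

def ic_sum(E, n):
    """Sum over the 2^n bits of I(x_i : beta_i) with n levels of concatenation."""
    return 2 ** n * info_per_bit(E ** n)

def taylor_lower_bound(E, n):
    """Right-hand side of (10)."""
    return (2 * E * E) ** n / (2 * log(2))

def first_violation(E, max_n=200):
    """Smallest n <= max_n at which the sum exceeds one bit, or None."""
    for n in range(1, max_n + 1):
        if ic_sum(E, n) > 1:
            return n
    return None

for E in (1 / sqrt(2), 0.72, 1.0):
    print(E, first_violation(E))
```

At the Tsirelson value $E = 1/\sqrt{2}$ no violation appears for any n up to the cutoff, while any slightly larger bias (0.72 in the example) leads to a violation after a finite number of levels.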

A quite straightforward generalization of this method
can be employed if the probabilities of guessing different
bits are different. In [27] it has been used to derive
the bound on the efficiency of RACs

$$\sum_{i=1}^{N} E_i^2 \le 1, \qquad (11)$$

where $E_i$ is the bias of the guessing probability for the
i'th bit.


D. Entropic approach 



In order to prove that IC holds in quantum mechan- 
ics we have used the properties of mutual information. 
This means that something must go wrong with entropy 
measures for superstrong nonlocal boxes, as indeed was 
discussed shortly after the first IC paper [15]. In the
latest development [16], it has been shown that all the
properties necessary for the derivation of IC are conse- 
quences of only two conditions. This means that even 
if we choose a measure of information different than the 
mutual information, the objects exhibiting more nonlo- 
cality than the quantum theory allows will violate at least 
one of these conditions. 

The conditions proposed in [16] are for the entropies
H(.). The information that object A has about B can
be defined, as for the von Neumann entropies, as
$I(A : B) = H(A) + H(B) - H(A,B)$. The first of the conditions
is consistency: if A is a classical random variable,
then H(A) is equal to the Shannon entropy of A. The
second is evolution with an ancilla: for any two systems
A and B, whenever a transformation is performed on B
alone, one must have $\Delta H(A,B) \ge \Delta H(B)$. It can be
understood as saying that local transformations can only
destroy correlations, not create them.

Since the consistency condition is nothing more than 
the normalization of the entropy, it must be the sec- 
ond one which is violated by the superstrong nonlocality. 
This provides another characterization of what is wrong 
with no-signaling theories that violate the Tsirelson bound:
even though they cannot instantaneously send information
at a distance, they can create correlations by local
transformations, which is just as unacceptable.

Recently a slightly generalized version of IC has been 
proposed [28]. It keeps all its reasonable appeal and leads
to entropic inequalities that are strictly stronger than in
the original version. Recall that the reasoning that led
us to stating IC included two steps. In the first step, we
argued that if Bob's part of the system together with
the message is treated as a single black box, then the
information it has about the settings of Alice cannot exceed
the classical communication capacity of the channel.
If we associate a random variable e with this black box, we
can express this statement formally as



$$H(m) \ge I(x : e). \qquad (12)$$



In the second step, we argued that the random variable $\beta_i$
is obtained locally from e, therefore the data processing
inequality implies

$$\forall i, \quad H(x_i | \beta_i) \ge H(x_i | e). \qquad (13)$$

If we sum up all the inequalities (12) and (13) and use
the subadditivity of the entropy we obtain

$$H(m) + \sum_i H(x_i | \beta_i) \ge H(x), \qquad (14)$$

which is equivalent to (6) in the case when the $x_i$ are independent.
But nothing forces us to sum up all the terms
with the same weight. In fact, we can use a different one
for each of the inequalities and get that, for all $w_i \ge 0$
and every $p(e|x)$, it holds

$$w_0 H(m) + \sum_i w_i H(x_i | \beta_i) \ge w_0 I(x : e) + \sum_i w_i H(x_i | e), \qquad (15)$$

which is strictly stronger than the original IC. It remains 
to be seen if this new version of the principle leads to 
tighter bounds on what is possible in our world and what 
is not. 
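
The step from (12) and (13) to (14) is stated compactly above. One way to spell it out, using only the quantities already defined and the identity $I(x : e) = H(x) - H(x|e)$, is sketched below; the ordering of the steps is an interpretation of the argument, not a quotation from [28].

```latex
\begin{align*}
H(m) + \sum_i H(x_i|\beta_i)
  &\ge I(x:e) + \sum_i H(x_i|e)  && \text{(12) plus the sum of (13) over } i \\
  &\ge I(x:e) + H(x|e)           && \text{subadditivity of the entropy} \\
  &=   H(x)                      && \text{since } I(x:e) = H(x) - H(x|e).
\end{align*}
```

For independent $x_i$ one has $H(x) = \sum_i H(x_i)$, so the chain rearranges to $\sum_i I(x_i : \beta_i) \le H(m)$ with $I(x_i : \beta_i) = H(x_i) - H(x_i|\beta_i)$.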



V. (UN?)EXPECTED COMPLEXITY 



The fact that IC solves the riddle of the Tsirelson
bound has been considered as a remarkable success. But
of course, the ultimate goal is far more ambitious: is IC
the physical principle that defines our quantum universe?
In other words, does IC define exactly the quantum set
within the no-signaling polytope, in any scenario? In the
following, we refer to this scientific quest as the IC
program.

Several subsequent studies have witnessed partial success
and led to a wealth of unanswered questions,
which are of course also an asset for research, at least as
long as their complexity does not suffocate the driving
motivation. In this last section, we review the status of
the IC program.



A. Non-isotropic correlations



The recovery of the Tsirelson bound proves that IC
defines the quantum set if one considers the single-parameter
family of "isotropic correlations", that is, the
probability points that can be written as a convex combination
of the PR-box and white noise. In the
first extension of the basic result, the authors considered
whether IC defines the whole quantum set in the CHSH
scenario [17]. The conclusion is that we don't know yet.
Specifically, the paper focused on two-parameter families
(recall that the no-signaling polytope lives in an
eight-dimensional space). The violation of IC is assessed using
the same explicit protocol described above, which is not
guaranteed to be optimal a priori. For some families, IC
is found to be violated as soon as one leaves the quantum
set; in other cases, a finite gap is left. Similar results
have been obtained by studying the probability points
that admit a Hardy's paradox [19].

Adopting an optimistic view on the IC program, one
may surmise that the gap is only due to the specific protocol
using concatenated RAC. Indeed, a subsequent paper
showed that this protocol is provably not optimal
for some points [18]: some points, which do not
exhibit a violation of IC under that protocol, can be
"distilled" to points which do violate IC under the same
protocol. In other words, if the process of "distillation" is
added to the protocol, the gap shrinks. However, it is not
yet fully closed. Notice that, apart from the fact itself of
belonging to the quantum set, we do not know any
sufficient condition for IC to be respected [32].

The scary part of it all comes when one realizes that we
are still speaking of the elementary CHSH scenario: two
parties, two inputs and two outputs! Quantum physics
is certainly more than that. What can one say for more
general scenarios?


B. Comparison with "macroscopic locality" 



The first natural generalization consists in keeping the 
bipartite scenario and enlarging the alphabets of the in- 
puts and/or the outputs of the no-signaling resource. Ob- 
viously, this can in principle be done by keeping the task 
as a RAC involving bits. For simplicity, though, the only
larger-alphabet study published so far [21] generalized
also the task to a RAC in which Alice receives N classical
dits and sends M = 1 classical dit to Bob. The underlying
no-signaling resources are such that $|\mathcal{X}| = |\mathcal{A}| = |\mathcal{B}| = d$,
while $|\mathcal{Y}| = 2$.

The main result of this paper is the observation that 
IC comes closer to defining the quantum set than does 
macroscopic locality (ML). The latter is another criterion
proposed with a similar scope [22]. It basically says
that, in an experiment with many independent sources, 
the coarse-grained statistics should not violate Bell's in- 
equalities. For instance, imagine a down-conversion ex- 
periment in which one would not be able to count pho- 
tons and had to rely on proportional counting: then the 
observed currents and their fluctuations could be com- 
patible with a classical source. 

The correlations that satisfy ML have been characterized
completely: they form a set which is close, but not
identical, to the quantum set. Therefore, it is a necessary
condition for the IC program to succeed that IC can rule
out more correlations than ML does. Reference [21] provides
examples of correlations for which it is indeed the
case. 



C. IC and multi-partite correlations



Complexity is further increased if one moves from
bipartite to multipartite situations. Even in the simplest
tripartite scenario (two inputs and two outputs per
party), the structure of the no-signaling polytope is
appalling.

One can certainly take multipartite boxes and use
them as the underlying no-signaling resource in a bipartite
scenario: for instance, in the tripartite case, Alice may
hold two of the input-output ports and even wire them
together, while Bob keeps the third port. This has been
tried, and the result is somewhat expected: bipartite IC is
powerful enough to rule out many examples of non-quantum
points, but not all. In fact, two different examples
have been reported of tripartite probability points which
are definitely not quantum but which exhibit classical
behavior in any bipartite scenario [26]. Therefore, in
order to pursue the IC program, one of the most urgent
tasks consists in finding a natural generalization of the
IC task to more parties; at the moment of writing, the
unpublished attempts we are aware of have not led to
anything interesting.


VI. CONCLUSION 



Formulated just two years ago, Information Causality 
has immediately attracted the attention of the scientific 
community. The reason for this success may be purely so- 
ciological: the idea that physics may be defined in terms 
of information processing has been lingering for many 
years and IC came to fill in the expectation. But we prefer
to think in more "objective" terms: as we have tried
to argue throughout this text, IC is a very sensible thing to
assume about the universe.

Improvements on the initial study have proved technically
challenging: often restricted to extremely specific
examples, they have nevertheless provided interesting information
about the power of the notion of IC and unraveled
some of its complex features. A few more of these
specific studies will certainly be welcome; but if the IC
program is to succeed, one will have to find a much
more comprehensive approach. It is our sincere wish that
this short review be outdated soon.

there is one more requirement: Bob after choosing to decode
one bit cannot learn anything about the other. In
RAC there is no such assumption although in the optimal 
RACs it is always the case. 

[32] A sufficient condition for IC to hold has been given [20],
but for a fixed protocol (how to use the no-signaling re- 
source, coding of the signal bit etc.); it is therefore of 
limited scope. 



